National Repository of Grey Literature: 14 records found, showing 1 - 10
Implementation of Methods for Automatic Data Mining (Implementace metod pro automatické dolování dat)
Smatana, Peter
This thesis describes the data mining process applied to the Give Me Some Credit exercise. The work guides the reader through the entire data mining process. The introduction explains what data mining is and which data will be mined. Next, a tool for cleaning and preparing the data is introduced. Finally, modeling is carried out and the results are evaluated.
Performance Ranking of Czech Credit Scoring Models
Smolár, Peter ; Havránek, Tomáš (advisor) ; Jakubík, Petr (referee)
This thesis provides a comprehensive ranking of 11 Czech statistical and 4 foreign credit scoring models. The ranking is based on the predictive performance of individual models, as measured by the area under the curve, evaluated on a randomly sampled set of 250 training and validation samples. After establishing a baseline comparison, 3 avenues of estimation setup optimization are explored, namely missing value treatment, estimation method and the use of additional non-financial variables. After being optimized, the models are once again ranked based on their predictive performance. Statistical inference is drawn using ANOVA and the Friedman test, along with the corresponding Tukey and Nemenyi post-hoc tests. In their baseline form, the Czech credit scoring models are found to be outperformed by the foreign benchmark model. Treating the missing values by OLS imputation and estimating the models by probit is found to significantly improve their predictive performance. In their optimized form, the difference in predictive performance between Czech and foreign credit scoring models is found to be only marginal. JEL Classification G28, G32, G33, G38 Keywords credit scoring, multiple discriminant analysis, logit analysis, probit analysis Author's e-mail 71247263@fsv.cuni.cz Supervisor's e-mail...
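The Friedman test mentioned in the abstract above compares several models by ranking them within each sample and checking whether the rank sums differ more than chance would allow. The following is a minimal sketch of the textbook statistic only; the function names and the toy AUC values are illustrative assumptions, not taken from the thesis, and no tie correction is applied to the statistic.

```python
def average_ranks(values):
    """Rank values ascending (1 = smallest); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def friedman_statistic(scores):
    """scores[s][m] is the AUC of model m on sample s (hypothetical layout).

    Returns (chi2, mean_ranks). Under the null hypothesis chi2 is approximately
    chi-square distributed with k - 1 degrees of freedom.
    """
    n = len(scores)       # number of samples (blocks)
    k = len(scores[0])    # number of models (treatments)
    rank_sums = [0.0] * k
    for row in scores:
        for m, r in enumerate(average_ranks(row)):
            rank_sums[m] += r
    chi2 = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    return chi2, [r / n for r in rank_sums]


# Toy data (invented for illustration): 4 samples, 3 models.
toy_auc = [
    [0.70, 0.75, 0.80],
    [0.68, 0.74, 0.79],
    [0.71, 0.73, 0.82],
    [0.69, 0.76, 0.78],
]
chi2, mean_ranks = friedman_statistic(toy_auc)
```

In this toy case every sample ranks the models in the same order, so the statistic reaches its maximum of n(k - 1). A post-hoc test such as Nemenyi would then compare the mean ranks pairwise.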
The future of credit scoring modelling using advanced techniques
Čermáková, Jolana ; Krištoufek, Ladislav (advisor) ; Geršl, Adam (referee)
Machine learning is becoming a part of everyday life and has an indisputable impact across a large array of industries. In the financial industry, this impact lies particularly in predictive modelling. The goal of this thesis is to describe the basic principles of artificial intelligence and its subset, machine learning. The most widely used machine learning techniques are outlined both in a theoretical and a practical way. As a result, four models were assembled within the thesis. The results and limitations of each model were discussed, and the models were also compared with one another based on their individual performance. The evaluation was executed on a real-world dataset provided by the Home Credit company. The final performance of the machine learning methods, measured by the KS and GINI metrics, was either very comparable to or even worse than the performance of a traditional logistic regression. Still, the problem may lie in an insufficient dataset, in improper data preparation, or in inappropriately used algorithms, not necessarily in the models themselves.
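The KS and GINI metrics named in the abstract above are commonly defined as follows: KS is the largest gap between the empirical score distributions of good and bad clients, and Gini equals 2 * AUC - 1. The sketch below uses these common definitions in plain Python; the function names, the label convention (1 = default) and the toy data are assumptions for illustration and do not come from the thesis.

```python
def auc(scores, labels):
    """Probability that a random bad client (label 1) gets a higher score
    than a random good client (label 0); ties count as one half."""
    bads = [s for s, y in zip(scores, labels) if y == 1]
    goods = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((b > g) + 0.5 * (b == g) for b in bads for g in goods)
    return wins / (len(bads) * len(goods))


def gini(scores, labels):
    """Gini coefficient derived from the AUC."""
    return 2.0 * auc(scores, labels) - 1.0


def ks_statistic(scores, labels):
    """Maximum absolute gap between the empirical score CDFs of goods and bads."""
    bads = [s for s, y in zip(scores, labels) if y == 1]
    goods = [s for s, y in zip(scores, labels) if y == 0]
    best = 0.0
    for t in sorted(set(scores)):
        cdf_bad = sum(s <= t for s in bads) / len(bads)
        cdf_good = sum(s <= t for s in goods) / len(goods)
        best = max(best, abs(cdf_good - cdf_bad))
    return best


# Toy data (invented): predicted default probabilities and observed outcomes.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]
print(gini(toy_scores, toy_labels), ks_statistic(toy_scores, toy_labels))
```

Both metrics depend only on how the scores order the clients, which is why they can be used to compare a logistic regression against other classifiers on the same dataset.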
Machine Learning for Credit Scoring
Myazina, Elena ; Pilát, Martin (advisor) ; Neruda, Roman (referee)
Title: Machine Learning for Credit Scoring Author: Elena Myazina Department / Institute: Department of Theoretical Computer Science and Mathematical Logic Supervisor of the master thesis: Mgr. Martin Pilát, Ph.D., Department of Theoretical Computer Science and Mathematical Logic Abstract: Credit scoring is a technique used by banks to evaluate clients who apply for different types of loans. Its goal is to predict whether a given client will repay their loan. Traditionally, mathematical models based on logistic regression are used for this task. In this thesis, we approach the problem of credit scoring from a machine learning point of view. We investigate several machine learning methods (including neural networks, random forests, support vector machines and others) and evaluate their performance on the credit scoring task using three publicly available datasets. Keywords: machine learning, credit scoring, logistic regression, neural networks, random forest
Performance Analysis of Credit Scoring Models on Lending Club Data
Polena, Michal ; Teplý, Petr (advisor) ; Pečená, Magda (referee)
In our master's thesis, we compare ten classification algorithms for credit scoring. Their predictive performance is measured by six different classification performance measures. We use a unique P2P lending data set with more than 200,000 records and 23 variables to compare the classifiers. This data set comes from Lending Club, the biggest P2P lending platform in the United States. Logistic regression, Artificial neural network, and Linear discriminant analysis are the best three classifiers according to our results. Random forest ranks as the fifth best classifier. On the other hand, Classification and regression tree and k-Nearest neighbors are ranked as the worst classifiers in our ranking.
Causes of segregation in microregions
Illmannová, Anne ; Sieber, Martina (advisor) ; Vlček, Josef (referee)
This bachelor thesis concerns the repayment of bank loans. The theoretical part of the thesis deals with the history of banking, its contemporary form and the types of bank loan products. Special attention is paid to mortgages and the conditions for obtaining them. The practical part of the thesis compares the probability of default across different indicators, first at the regional and then at the district level. The aim of the thesis is to explain, using a concrete case from Prague, which sociodemographic characteristics influence the payment discipline of debtors, what the differences are between areas whose residents have good and bad payment discipline, and whether the place of residence influences the level of payment discipline. Keywords Banking, mortgage, credit scoring, probability of default.
Building credit scoring models using selected statistical methods in R
Jánoš, Andrej ; Bašta, Milan (advisor) ; Pecáková, Iva (referee)
Credit scoring is an important and rapidly developing discipline. The aim of this thesis is to describe the basic methods used for building and interpreting credit scoring models, with an example of applying these methods to design such models using the statistical software R. The thesis is organized into five chapters. Chapter one explains the term credit scoring, gives the main examples of its application and presents the motivation for studying this topic. The next chapters introduce the three methods most often used in financial practice for building credit scoring models. Chapter two discusses the most developed one, logistic regression. The main emphasis is put on the logistic regression model, which is characterized from a mathematical point of view, and various ways to assess the quality of the model are also presented. The other two methods presented in this thesis are decision trees and Random forests, which are covered in chapters three and four. An important part of this thesis is a detailed application of the described models to a specific data set, Default, using the R program. The final, fifth chapter is a practical demonstration of building credit scoring models, their diagnostics and the subsequent evaluation of their applicability in practice using R. The appendices include the R code used throughout the thesis as well as the functions developed for testing the final model. The key aim of the work is to provide enough theoretical knowledge and practical skills for a reader to fully understand the mentioned models and to be able to apply them in practice.
Selection Bias Reduction in Credit Scoring Models
Ditrich, Josef ; Hebák, Petr (advisor) ; Pecáková, Iva (referee) ; Zamrazilová, Eva (referee)
Nowadays, the use of credit scoring models in the financial sector is a common practice. Credit scoring plays an important role in the profitability and transparency of the lending business. Given the high credit volumes, even a small improvement in the discriminatory and predictive power of a credit scoring model may provide a substantial additional profit. Scoring models are applied to the through-the-door population; however, when creating them or adjusting already existing credit rules, it is usual to use only the data on accepted applicants, for whom payment discipline can be observed. This discrepancy can lead to reject bias (or selection bias in general). Methods that try to eliminate or reduce this phenomenon are known by the term reject inference. In general, these methods try to assess the behavior of rejected applicants or to obtain additional information about them. In the dissertation thesis, I dealt with the enlargement method, which is based on randomly accepting applicants who would otherwise have been rejected. This method is not only time-consuming but also expensive. Therefore I looked for ways to reduce the cost of acquiring additional information about rejected applicants. As a result, I have proposed a modification which I called the enlargement method with sorting variable. It was validated on a real bank database with two possible sorting variables, and the results were compared with the original version of the method. It was shown that both tested approaches can reduce the cost of the method while retaining the accuracy of the scoring models.
